Sentence Level Paraphrase Identification System for Tamil Language
Similar resources
Sentence Level Dialect Identification for Machine Translation System Selection
In this paper we study the use of sentence-level dialect identification in optimizing machine translation system selection when translating mixed-dialect input. We test our approach on Arabic, a prototypical diglossic language, and we optimize the combination of four different machine translation systems. Our best result improves over the best single MT system baseline by 1.0% BLEU and over a st...
Handwritten Document Retrieval System for Tamil Language
The paper attempts to create a handwritten document retrieval system suitable for the Tamil language, with a view to recording traditional literature content for future reference. It proposes a search mechanism to access the query word images using a statistical model-based methodology. The scheme revolves around a well-defined procedure which results in word models from where the search word can be r...
Clause Boundary Identification for Tamil Language Using Dependency Parsing
Clause boundary identification is a very important task in natural language processing. Identifying the clauses in a sentence becomes difficult if the clauses are embedded inside other clauses. In our approach, we use the dependency parser to identify the boundary of the clause. The dependency tag set contains 11 tags and is useful for identifying the boundary of the cla...
Language Modeling with Sentence-Level Mixtures
This paper introduces a simple mixture language model that attempts to capture long distance constraints in a sentence or paragraph. The model is an m-component mixture of trigram models. The models were constructed using a 5K vocabulary and trained using a 76 million word Wall Street Journal text corpus. Using the BU recognition system, experiments show a 7% improvement in recognition accuracy ...
Morpheme Based Language Model for Tamil Speech Recognition System
This paper describes the design of a morpheme-based language model for the Tamil language. It aims to alleviate the main problems encountered in processing Tamil, such as the enormous vocabulary growth caused by the large number of different forms derived from one word. The size of the vocabulary is reduced by decomposing the words into stems and endings and storing these subword units (morphemes...
Journal
Journal title: International Journal of Darshan Institute on Engineering Research & Emerging Technology
Year: 2018
ISSN: 2320-7590
DOI: 10.32692/ijdi-eret/7.1.2018.1805